The concept of document warehousing for multi-dimensional modeling of textual-based business intelligence

Authors

  • Frank S. C. Tseng
  • Annie Y. H. Chou
Abstract

During the past decade, data warehousing has been widely adopted in the business community. It provides multidimensional analyses of accumulated historical business data to support contemporary administrative decision-making. Nevertheless, it is believed that only about 20% of information can be extracted from data warehouses, which concern numeric data only; the other 80% is hidden in non-numeric data or even in documents. Therefore, many researchers now advocate research on document warehousing to capture complete business intelligence. Document warehouses, unlike traditional document management systems, include extensive semantic information about documents, cross-document feature relations, and document grouping or clustering to provide more accurate and more efficient access to text-oriented business intelligence. In this paper, we discuss the basic concept of document warehousing and present its formal definitions. Then, we propose a general system framework and elaborate on some useful applications to illustrate the importance of document warehousing. The work is essential for establishing an infrastructure to help combine text processing with numeric OLAP processing technologies. The combination of data warehousing and document warehousing will be one of the most important kernels of knowledge management and customer relationship management applications. © 2005 Elsevier B.V. All rights reserved.

Similar resources

The Concept of Document Warehousing and Its Applications on Managing Enterprise Business Intelligence

During the past decade, data warehousing has been widely adopted in the business community. It provides multi-dimensional analyses on cumulated historical business data for helping contemporary administrative decision-makings. Nevertheless, it is believed there is only about 20% information can be extracted from data warehouses concerning numeric data only, the other 80% information is hidden i...

D-Tree: A Multi-Dimensional Indexing Structure for Constructing Document Warehouses

Document warehouses, unlike traditional document management systems, contain extensive semantic information about documents, cross-document feature relations, and document grouping or clustering, thus providing an accurate and efficient access to business intelligence information. Since documents are multi-dimensional in nature, we claim that traditional indexing methods are not really suitable...

Injection Optimization for Heavy Duty Diesel Engine in Order to Find High Efficiency and Low NOx Engine Concept by Means of Quasi Dimensional Multi-Zone Spray Modeling

The purpose of this study is to investigate the effect of injection parameters on a heavy duty diesel engine performance and emission characteristics. In order to analyze the injection and spray characteristics of diesel fuel with employing high-pressure common-rail injection system, the injection characteristics such as injection delay, injection duration, injection rate, number of nozzle hole...

The Comparison of Anchor and Star Schema from a Query Performance Perspective

Today's business environment requires that companies have access to highly relevant information in a matter of seconds. Modern Business Intelligence tools rely on data structured mostly in traditional dimensional database schemas, typically represented by star schemas. Dimensional modeling is already recognized as a leading industry standard in the field of data warehousing although several dra...

An Ensemble Click Model for Web Document Ranking

Annually, web search engine providers spend more and more money on documents ranking in search engines result pages (SERP). Click models provide advantageous information for ranking documents in SERPs through modeling interactions among users and search engines. Here, three modules are employed to create a hybrid click model; the first module is a PGM-based click model, the second module in a d...

Journal:
  • Decision Support Systems

Volume 42, Issue

Pages -

Publication date: 2006